This writeup describes procedures that could be useful in target detection and lists the
elementary operations needed to implement them. These operations could also be useful for
other target detection methods. All of them have a high degree of parallelism, so it should
be possible to implement them on a parallel architecture to increase the speed of operation.

1 Preprocessing 

This is the first step, which is used to subtract the background and remove clutter as much 
as possible. 

1.1 Low-Stop Filtering 

• Useful in removing background in case it is uniform, or slowly varying over space. 

• Involves subtraction of a low pass-filtered image from the original image (or another 
low-pass filtered image with a smaller size, i.e. higher cut-off frequency). 

• Low pass filter can be implemented by hierarchical (pyramid) method [3] as follows: 

• The original image is denoted by $f(x,y)$, where $(x,y)$ are the pixel coordinates. This
forms level 0 of the pyramid, denoted by:

$$ f^{(0)} = f \tag{1} $$

Level $k+1$ is obtained from level $k$ by applying the low-pass filter $h$ followed by down-sampling:

$$ f^{(k+1)} = (\downarrow 2)(h * f^{(k)}) \tag{2} $$

At a certain level k = n the process is stopped. The number n determines the size of 
the low-pass filter. 

$$ g^{(n)} = f^{(n)} \tag{3} $$

The reverse process is then carried out to obtain level $k-1$ from level $k$ by up-sampling and
filtering with another filter $h'$:

$$ g^{(k-1)} = h' * ((\uparrow 2)\, g^{(k)}) \tag{4} $$

The low-pass filtered output is given by 


$$ g = g^{(0)} \tag{5} $$


• Spatial Convolution with 2-D filter is given by: 

$$ (f * h)(x, y) = \sum_{x', y'} f(x - x', y - y')\, h(x', y') \tag{6} $$


The convolution is usually separable into two 1-D convolutions $h_x$ and $h_y$, where
$h = h_x * h_y$:

$$ (f * h_x * h_y)(x, y) = \sum_{y'} \Big[ \sum_{x'} f(x - x', y - y')\, h_x(x') \Big] h_y(y') \tag{7} $$


The filter h used for low-pass filtering is given by: 

$$ h_x(x) = [\,h_x(-1)\ \ h_x(0)\ \ h_x(1)\ \ h_x(2)\,] = [\,1\ \ 3\ \ 3\ \ 1\,]/8 \tag{8} $$

The same filter is used for $h_y(y)$. $h'$ is given by reflection of $h$, i.e. $h'(x, y) = h(-x, -y)$,
and is also separable.

• Down-sampling and up-sampling are given by: 

$$ ((\downarrow 2) f)(x, y) = f(2x, 2y) \tag{9} $$

$$ ((\uparrow 2) f)(x, y) = \begin{cases} f(x/2,\, y/2) & \text{for even } x, y \\ 0 & \text{otherwise} \end{cases} \tag{10} $$

• More efficient implementations are possible by doing down-sampling before low-pass, 
and up-sampling after high-pass. These are known as polyphase implementations [5]. 
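The pyramid low-stop filter of Eqs. (1)-(10) can be sketched in plain Python as follows. This is only an illustrative sketch, not the report's implementation: the function names are hypothetical, pixels outside the image are treated as zero, the image sides are assumed to be divisible by $2^n$, and the up-sampling filter is scaled by 2 per axis (an assumption not stated above) to compensate for the zeros inserted by Eq. (10).

```python
H = [1 / 8, 3 / 8, 3 / 8, 1 / 8]   # taps h_x(-1), h_x(0), h_x(1), h_x(2), Eq. (8)
OFFSETS = [-1, 0, 1, 2]            # the x' index that belongs to each tap


def conv_rows(img, taps, offsets):
    """1-D convolution along x: out[y][x] = sum_k taps[k] * img[y][x - offsets[k]]."""
    w = len(img[0])
    out = []
    for row in img:
        new_row = []
        for x in range(w):
            s = 0.0
            for t, o in zip(taps, offsets):
                if 0 <= x - o < w:          # zero outside the image (assumption)
                    s += t * row[x - o]
            new_row.append(s)
        out.append(new_row)
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def conv_sep(img, taps, offsets):
    """Separable 2-D convolution, Eq. (7): filter along x, then along y."""
    tmp = conv_rows(img, taps, offsets)
    return transpose(conv_rows(transpose(tmp), taps, offsets))


def down2(img):
    """Eq. (9): keep every second pixel in each dimension."""
    return [row[::2] for row in img[::2]]


def up2(img):
    """Eq. (10): insert zeros between the samples."""
    h, w = len(img), len(img[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            out[2 * y][2 * x] = img[y][x]
    return out


def pyramid_lowpass(f, n):
    """Eqs. (1)-(5): go n levels down the pyramid, then n levels back up."""
    g = f
    for _ in range(n):
        g = down2(conv_sep(g, H, OFFSETS))
    reflected = [-o for o in OFFSETS]       # h'(x) = h(-x)
    taps_up = [2 * t for t in H]            # assumed gain of 2 per axis
    for _ in range(n):
        g = conv_sep(up2(g), taps_up, reflected)
    return g


def low_stop(f, n):
    """Subtract the low-pass filtered image from the original."""
    g = pyramid_lowpass(f, n)
    return [[a - b for a, b in zip(rf, rg)] for rf, rg in zip(f, g)]
```

With this sketch, a uniform background is cancelled away from the image border, e.g. `low_stop([[8.0] * 16 for _ in range(16)], 1)[8][8]` is zero up to rounding; pixels near the border are affected by the zero-padding assumption.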


1.2 Morphological Filtering 

• Useful in removing large-sized clutter from small-sized objects. 

• Involves subtraction between an image and its morphological opening or closing. 

$$ f_o = f - (f \circ m) $$

$$ f_c = (f \bullet m) - f \tag{11} $$

• Opening is an erosion followed by dilation, and closing is a dilation followed by erosion. 

$$ (f \circ m) = (f \ominus m) \oplus m $$

$$ (f \bullet m) = (f \oplus m) \ominus m \tag{12} $$
• Dilation ($\oplus$) and erosion ($\ominus$) are defined by [4]:

$$ (f \oplus m)(x, y) = \max_{x', y'} \{ f(x - x', y - y') + m(x', y') \} $$

$$ (f \ominus m)(x, y) = \min_{x', y'} \{ f(x + x', y + y') - m(x', y') \} \tag{13} $$

where the values of $f$ and $m$ are assumed to be $-\infty$ outside the region of interest in
order to make these terms redundant in max or min operations.

• The mask size currently assumed is 5 x 5, but a slightly larger size may be needed.

• If the mask $m$ is separable in dimension, i.e. $m = m_x \oplus m_y$, the 2-D mask can be
replaced by two 1-D masks, applied one after another:

$$ (f \oplus m)(x, y) = ((f \oplus m_x) \oplus m_y)(x, y) = \max_{y'} \{ \max_{x'} \{ f(x - x', y - y') + m_x(x') \} + m_y(y') \} \tag{14} $$

$$ (f \ominus m)(x, y) = ((f \ominus m_x) \ominus m_y)(x, y) = \min_{y'} \{ \min_{x'} \{ f(x + x', y + y') - m_x(x') \} - m_y(y') \} \tag{15} $$

• If $m(x', y') = 0$ over the whole mask, i.e. the region of interest, dilation and erosion
reduce to max and min operations over the mask, respectively.
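For the flat-mask case ($m = 0$ over the mask), the operations of Eqs. (11)-(15) can be sketched as below. This is a minimal sketch under stated assumptions: the function names and the `radius` parameter are hypothetical (radius 2 gives the 5 x 5 mask), the mask is applied as two 1-D passes, and pixels outside the image are simply left out of the max or min.

```python
def _extreme_rows(img, radius, pick):
    """Apply pick (max or min) over a 1-D window of half-width radius along x."""
    w = len(img[0])
    return [[pick(row[max(0, x - radius): x + radius + 1]) for x in range(w)]
            for row in img]


def _transpose(img):
    return [list(col) for col in zip(*img)]


def _extreme_2d(img, radius, pick):
    """Separable form of Eqs. (14)-(15) for a flat mask: x pass, then y pass."""
    tmp = _extreme_rows(img, radius, pick)
    return _transpose(_extreme_rows(_transpose(tmp), radius, pick))


def dilate(img, radius):
    return _extreme_2d(img, radius, max)


def erode(img, radius):
    return _extreme_2d(img, radius, min)


def opening(img, radius):
    """Eq. (12): erosion followed by dilation."""
    return dilate(erode(img, radius), radius)


def closing(img, radius):
    """Eq. (12): dilation followed by erosion."""
    return erode(dilate(img, radius), radius)


def open_residue(img, radius):
    """Eq. (11): f_o = f - (f opened by m); keeps bright objects smaller than the mask."""
    opened = opening(img, radius)
    return [[a - b for a, b in zip(rf, ro)] for rf, ro in zip(img, opened)]
```

In this sketch a single bright pixel on a dark background survives `open_residue` unchanged, while a region that is uniform over more than the mask size is removed.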


2 Temporal Integration 

This is useful to enhance targets with low signal to noise ratio. We have studied three 
approaches for performing temporal integration. These can be applied one after another in 
the given order. 


2.1 Temporal Averaging 

• Initially, a number of frames would be averaged (or summed) to bring the maximum 
target image velocity to one pixel per number of summed frames. 

• A forgetting factor may be used to give greater weight to more recent frames.

• May be implemented in recursive or hierarchical fashion. 


Recursive implementation:

$$ F(x, y; 0) = 0 \tag{16} $$

$$ F(x, y; t) = f(x, y; t) + \alpha F(x, y; t - 1) \tag{17} $$

or

$$ F(x, y; t) = (1 - \alpha) f(x, y; t) + \alpha F(x, y; t - 1) \tag{18} $$

where

• $f(x, y; t)$ is the value of pixel $(x, y)$ in frame $t$

• $F(x, y; t)$ is the output of the algorithm at frame $t$, obtained using the output
$F(x, y; t - 1)$ from the previous frame.

• $\alpha$ is the forgetting factor between 0 (full forgetting) and 1 (no forgetting).

Hierarchical implementation: 

• Level 0 represents the original image with $t_0 = t$ as the frame number:

$$ f_0(x, y; t_0) = f(x, y; t) \tag{19} $$

• Level 1 is formed by summing two consecutive images. The frame rate at level 1 is
reduced by half, and the frame number is denoted by $t_1 = t_0/2$, where only even $t_0$ is
used.

$$ f_1(x, y; t_1) = f_0(x, y; t_0 + 1) + \alpha f_0(x, y; t_0) \tag{20} $$

• Level $k$ is formed using level $k - 1$ as:

$$ f_k(x, y; t_k) = f_{k-1}(x, y; t_{k-1} + 1) + \alpha_k f_{k-1}(x, y; t_{k-1}), \quad t_k = t_{k-1}/2, \ t_{k-1} \text{ even} \tag{21} $$

This expression is equivalent to the weighted sum of $2^k$ image frames, from equation
(17) for $t_k = 1$ and $t = 2^k$.
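The two implementations can be compared with a short sketch. The function names are hypothetical, frames are plain nested lists, and the number of frames is assumed to be a power of two for the hierarchical form. The level-dependent factor is taken as $\alpha_k = \alpha^{2^{k-1}}$, which is an assumption chosen so that both forms give the same weighted sum; it matches the factors $\alpha$ and $\alpha^2$ quoted for levels 1 and 2 in the caption of Fig. 1.

```python
def _combine(new, old, weight):
    """Pixelwise new + weight * old for two images stored as nested lists."""
    return [[p + weight * q for p, q in zip(rn, ro)] for rn, ro in zip(new, old)]


def recursive_average(frames, alpha):
    """Eqs. (16)-(17): F(t) = f(t) + alpha * F(t - 1), starting from F = 0."""
    acc = [[0.0] * len(frames[0][0]) for _ in frames[0]]
    for frame in frames:
        acc = _combine(frame, acc, alpha)
    return acc


def hierarchical_average(frames, alpha):
    """Eqs. (19)-(21): pairwise sums, halving the frame rate at each level."""
    level = list(frames)
    a = alpha                       # alpha_1 = alpha
    while len(level) > 1:
        level = [_combine(level[i + 1], level[i], a)
                 for i in range(0, len(level), 2)]
        a = a * a                   # assumed alpha_k = alpha ** (2 ** (k - 1))
    return level[0]
```

For four one-pixel frames with values 1, 2, 3, 4 and $\alpha = 0.5$, both functions return 6.125. The hierarchical form exposes more parallelism, since all pairs at a level can be combined at once.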

2.2 Temporal Shift and Add 

• This generalizes the hierarchical procedure for temporal averaging to account for the
target image velocity.

• For simplicity, we consider the target velocity to lie between 0 and 1 pixel per frame
in the x and y directions. The procedure for negative velocities is the mirror image
of this.

• At each hierarchical level $k$, the velocity can be resolved into $2^k$ sub-intervals per
dimension, each labelled $(u_k, v_k)$ and representing the velocity around $(u, v) = (u_k, v_k)/2^k$.
Note that these sub-intervals are not mutually exclusive, but overlap each other.

• The frame rate is reduced by half at each stage, so that the frame number is $t_k = t/2^k$
for $t$ divisible by $2^k$.

• Hence, we have $2^k \times 2^k$ images corresponding to velocities from 0 to 1 in each dimension.
The actual number is 4 times this, to account for negative velocities in each dimension.
A pixel $(x, y)$ in image $(u_k, v_k)$, representing velocity around $(u_k, v_k)/2^k$, at frame $t_k$ is
denoted by $f_k(x, y; u_k, v_k; t_k)$.

• Images at level $k$ are formed by shifting appropriately, and adding images at level $k - 1$
according to the following equation:

$$ f_k(x, y; u_k, v_k; t_k) = f_{k-1}(x, y; u_{k-1}, v_{k-1}; t_{k-1} + 1) + \alpha_k f_{k-1}(x - u'_{k-1}, y - v'_{k-1}; u_{k-1}, v_{k-1}; t_{k-1}) \tag{22} $$

where

$$ (u_{k-1}, v_{k-1}) = (\lfloor u_k/2 \rfloor, \lfloor v_k/2 \rfloor) $$

$$ (u'_{k-1}, v'_{k-1}) = (\lceil u_k/2 \rceil, \lceil v_k/2 \rceil) \tag{23} $$

• At each level, the frame rate reduces by half, but the number of states increases 4 times. 

Hence, the memory and computational complexity increases rapidly with the number of 
frames integrated and the process should be stopped before there is a resource crunch. 

• However, if sub-pixel velocity resolution is required, or the SNR is low, this stage could 
be helpful. 

• Fig. 1 shows the shift and add process for one dimension. This can be generalized to 
two dimensions as shown in Fig. 2. 
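The one-dimensional case can be sketched as follows. This is a hedged illustration rather than the report's implementation: the names are hypothetical, frames are 1-D lists, only non-negative velocities are handled, shifted-in samples are zero, the number of frames is assumed to be a power of two, and $\alpha_k = \alpha^{2^{k-1}}$ is assumed as in the temporal averaging case.

```python
def shift_right(signal, s):
    """Return g with g[x] = signal[x - s], filling with zeros at the border."""
    n = len(signal)
    return [signal[x - s] if 0 <= x - s < n else 0.0 for x in range(n)]


def shift_and_add(frames, alpha):
    """1-D version of Eqs. (22)-(23).

    Returns a dict mapping the velocity state u_k to the integrated signal
    at the top level, where the number of frames equals 2 ** k.
    """
    # Level 0: every frame has the single velocity state u_0 = 0.
    level = [{0: list(frame)} for frame in frames]
    a = alpha                                   # alpha_1 = alpha (assumed schedule)
    k = 0
    while len(level) > 1:
        k += 1
        next_level = []
        for i in range(0, len(level), 2):
            older, newer = level[i], level[i + 1]
            states = {}
            for u in range(2 ** k):
                u_prev = u // 2                 # floor(u_k / 2): state used at level k - 1
                shift = (u + 1) // 2            # ceil(u_k / 2): shift of the older image
                moved = shift_right(older[u_prev], shift)
                states[u] = [p + a * q for p, q in zip(newer[u_prev], moved)]
            next_level.append(states)
        level = next_level
        a = a * a
    return level[0]
```

For a unit target that moves one pixel per frame over four frames and $\alpha = 1$, state 3 accumulates the value 4 at the target's final position, whereas state 0 (zero velocity) never exceeds 1.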

2.3 Dynamic Programming 

• Once the complexity of temporal shift and add has grown too large, dynamic programming
can be used in its place.

• The dynamic programming method was used for target detection by Barniv [2] and Arnold et al. [1].

• This process does not increase the number of states or complexity any further. Instead 
of increasing the number of states at the higher level, a ‘maximum’ operation is used. 

• This process can be implemented recursively as: 

$$ F(x, y; u_n, v_n; 0) = 0 \tag{24} $$

$$ F(x, y; u_n, v_n; t_n) = f(x, y; u_n, v_n; t_n) + \alpha_n \max_{i=0}^{\pm 1} \max_{j=0}^{\pm 1} F(x - u_n - i, y - v_n - j; u_n, v_n; t_n - 1) \tag{25} $$

where $f(x, y; u_n, v_n; t_n)$ is the value of pixel $(x, y)$ in frame $t_n$, $F(x, y; u_n, v_n; t_n)$ is the
output of the algorithm at frame $t_n$ obtained using the output $F(x', y'; u_n, v_n; t_n - 1)$
from the previous frame, and the range $\pm 1$ is applied so that $u_n$ and $i$, as well as $v_n$ and $j$,
have the same sign.
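A one-dimensional sketch of the recursion in Eqs. (24)-(25) for a single velocity state is given below. The names are hypothetical and several details are assumptions: frames are 1-D lists, `u` is the per-frame shift of the state in pixels, a single factor `alpha` is used for every step, and predecessors that fall outside the image are ignored (contributing 0 when none is available).

```python
def dp_step(prev_F, frame, u, alpha):
    """One step of Eq. (25) in 1-D for the velocity state u."""
    n = len(frame)
    step = 1 if u >= 0 else -1          # i takes the values 0 and +/-1, with the sign of u
    out = []
    for x in range(n):
        candidates = [prev_F[x - u - i] for i in (0, step) if 0 <= x - u - i < n]
        best = max(candidates, default=0.0)
        out.append(frame[x] + alpha * best)
    return out


def dp_integrate(frames, u, alpha):
    """Run the recursion from F = 0 (Eq. (24)) over all frames."""
    F = [0.0] * len(frames[0])
    for frame in frames:
        F = dp_step(F, frame, u, alpha)
    return F
```

For a unit target at positions 0, 1, 3, 4 in four successive frames (one step is a pixel longer than the nominal shift `u = 1`) and `alpha = 1`, the output at position 4 is 4. The max operation absorbs the extra pixel without creating a new velocity state.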


3 Spatial Integration 

• All the above processes can be carried out at various levels of resolution to enable 
efficient detection of targets of all sizes. 

• The process involves convolution and down-sampling as performed in the hierarchical 
low-pass filter described in Sec. 1.1. 
4 Elementary Operations

The following elementary operations would be needed for the implementation of the above 
procedures, as well as some more algorithms. 

1. Spatial Convolution with 2-D or 1-D filter: This is useful for low-pass filtering and 
other linear filtering applications. The equations are given in Sec. 1.1 

2. Morphological dilation and erosion with 2-D or 1-D masks: This is useful for the 
morphological filtering to remove clutter. Maximum operator in dynamic programming 
can also be implemented as a special case of morphological dilation. The equations are 
given in Sec. 1.2. 

3. Spatio-Temporal Convolution: This is the convolution performed across two dimensions
of space and one of time. It can be useful for computing optical flow using spatial and
temporal gradients.

$$ (f * h)(x, y, t) = \sum_{x', y', t'} f(x - x', y - y', t - t')\, h(x', y', t') \tag{26} $$

In most cases, the filter would be separable into three 1-D parts, and this becomes:

$$ (f * h)(x, y, t) = (f * h_x * h_y * h_t)(x, y, t) = \sum_{t'} \Big[ \sum_{y'} \Big[ \sum_{x'} f(x - x', y - y', t - t')\, h_x(x') \Big] h_y(y') \Big] h_t(t') \tag{27} $$

4. Pointwise unary operations: This is a pointwise function of a single image. No neigh- 
boring pixels are used to perform the operation. Examples of such operations include 
scaling, square, square-root, etc. These could be useful for many different algorithms. 

$$ g(x, y) = A(f(x, y)) \tag{28} $$

5. Pointwise binary operations: This is a pointwise binary function of corresponding pixels 
in two images. Examples of these operations include add, subtract, multiply, divide, 
max, min, etc. 

$$ g(x, y) = B(f_1(x, y), f_2(x, y)) \tag{29} $$

6. Image warping: This is used to transform an image from one coordinate system to 
another. It is based on the coordinate transformation: 

$$ (x', y') = T(x, y) \quad \text{or} \quad (x, y) = T^{-1}(x', y') \tag{30} $$

where $(x, y)$ and $(x', y')$ denote the coordinates in the original and the transformed
image. If $T^{-1}$ is an integer mapping, we have:

$$ g(x', y') = f(T^{-1}(x', y')) \tag{31} $$

If $T^{-1}$ is a real mapping, we can use bilinear interpolation to obtain the value
of $g(x', y')$.
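Image warping with a real-valued inverse mapping can be sketched as below. The function name `warp` and its parameters are hypothetical, the image is a nested list indexed as `f[y][x]`, and source pixels outside the image are assumed to contribute zero.

```python
import math


def warp(f, t_inv, out_h, out_w):
    """Eqs. (30)-(31): g(x', y') = f(T^-1(x', y')) with bilinear interpolation.

    t_inv(xp, yp) returns the (x, y) source coordinates, which may be non-integer.
    """
    h, w = len(f), len(f[0])
    g = [[0.0] * out_w for _ in range(out_h)]
    for yp in range(out_h):
        for xp in range(out_w):
            x, y = t_inv(xp, yp)
            x0, y0 = math.floor(x), math.floor(y)
            ax, ay = x - x0, y - y0             # fractional parts
            val = 0.0
            # Weighted sum of the four neighbouring source pixels.
            for dy, wy in ((0, 1.0 - ay), (1, ay)):
                for dx, wx in ((0, 1.0 - ax), (1, ax)):
                    xs, ys = x0 + dx, y0 + dy
                    if wx * wy != 0.0 and 0 <= xs < w and 0 <= ys < h:
                        val += wx * wy * f[ys][xs]
            g[yp][xp] = val
    return g
```

With an integer mapping the interpolation weights collapse to a single pixel, which reproduces Eq. (31) directly; a half-pixel shift such as `lambda xp, yp: (xp + 0.5, yp + 0.5)` averages the four neighbours.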

This list of operations is not exhaustive; we may come across more operations as we 
explore methods for target detection. 
Figure 1: Shows the process of hierarchical shifting and adding of image frames in one dimension. Level 0 includes single frames.
At level 1, two frames are shifted appropriately, and added with a forgetting factor $\alpha$. Two velocity states are produced.

At level 2, the states from level 1 are added after shifting, with a forgetting factor of $\alpha^2$. This gives a forgetting
factor of $\alpha$ if we consider the individual frames. Four velocity states are obtained.









Figure 2: Shows the states obtained in case of 2-D shift and add. The pixel (x,y) is the top-right pixel, and other pixels are displaced 
with respect to it. In each pixel, the number denotes the number of frames added. The last frame is in pixel (x,y) and other frames as 
one goes backwards are along the line, corresponding to the trajectory of the target. 




















